Performance Analysis of the Kahan-Enhanced Scalar Product on Current Multicore Processors
Authors
Abstract
We investigate the performance characteristics of a numerically enhanced scalar product (dot) kernel loop that uses the Kahan algorithm to compensate for numerical errors, and describe efficient SIMD-vectorized implementations on recent Intel processors. Using low-level instruction analysis and the execution-cache-memory (ECM) performance model, we pinpoint the relevant performance bottlenecks for single-core and thread-parallel execution, and predict performance and saturation behavior. We show that the Kahan-enhanced scalar product comes at almost no additional cost compared to the naive (non-Kahan) scalar product if appropriate low-level optimizations, notably SIMD vectorization and unrolling, are applied. We also investigate the impact of architectural changes across four generations of Intel Xeon processors.
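For orientation, the two kernels compared in the abstract can be sketched in scalar form as follows. This is a minimal illustration in Python, not the SIMD-vectorized implementation the paper describes; the function names `naive_dot` and `kahan_dot` and the test data are assumptions chosen for the example.

```python
import math


def naive_dot(a, b):
    """Plain scalar product: one multiply and one add per element."""
    s = 0.0
    for x, y in zip(a, b):
        s += x * y
    return s


def kahan_dot(a, b):
    """Scalar product with Kahan-compensated accumulation.

    The variable c carries the rounding error of the previous addition
    and is subtracted from the next product before it is accumulated.
    """
    s = 0.0  # running sum
    c = 0.0  # running compensation (error of the last addition)
    for x, y in zip(a, b):
        prod = x * y
        corrected = prod - c       # apply the stored correction
        t = s + corrected          # low-order bits may be lost here
        c = (t - s) - corrected    # recover what was lost
        s = t
    return s


if __name__ == "__main__":
    # Hypothetical test data: one large term followed by many tiny ones.
    n = 100_000
    a = [1.0] + [1e-16] * n
    b = [1.0] * (n + 1)
    reference = math.fsum(x * y for x, y in zip(a, b))
    print("naive error:", abs(naive_dot(a, b) - reference))
    print("kahan error:", abs(kahan_dot(a, b) - reference))
```

With this input, every tiny term is rounded away in the naive loop because 1e-16 is below half the spacing of doubles near 1.0, while the compensated loop keeps their contribution. The Kahan version needs three extra additions or subtractions per element, which is the overhead that vectorization and unrolling are meant to hide according to the abstract.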
Similar resources
Performance analysis of the Kahan-enhanced scalar product on current multi- and manycore processors
SUMMARY: We investigate the performance characteristics of a numerically enhanced scalar product (dot) kernel loop that uses the Kahan algorithm to compensate for numerical errors, and describe efficient SIMD-vectorized implementations on recent multi- and manycore processors. Using low-level instruction analysis and the execution-cache-memory (ECM) performance model we pinpoint the relevant perf...
On the accuracy and usefulness of analytic energy models for contemporary multicore processors
This paper presents refinements to the execution-cache-memory performance model and a previously published power model for multicore processors. The combination of both enables a very accurate prediction of performance and energy consumption of contemporary multicore processors as a function of relevant parameters such as number of active cores as well as core and Uncore frequencies. Model vali...
Microprocessor Thermal Analysis using the Finite Element Method
The microelectronics industry is pursuing many options to sustain the performance improvement expected every two years. One method for performance improvement is scaling transistor sizes down such that many more transistors can be compacted on chip. The on-chip temperature is a concern because the reliability and performance can be degraded due to hot spots. Thermal modeling of the chip will al...
Multicore Processors: Challenges, Opportunities, Emerging Trends
This paper undertakes a critical review of the current challenges in multicore processor evolution, underlying trends and design decisions for future multicore processor implementations. It is first shown that, for keeping up with Moore's law during the last decade, the VLSI scaling rules for processor design had to be dramatically changed. In future multicore designs large quantities of dark s...
Technical Report UPC-DAC-RR-2010-2: Decomposable and Responsive Power Models for Multicore Processors using Performance Counters
Power modeling based on performance monitoring counters (PMCs) has attracted the interest of many researchers since it became a quick approach to understand and analyse power behavior on real systems. Moreover, several power-aware policies use power models to guide their decisions and to trigger low-level mechanisms (e.g., managing processor frequency). Hence, the information, the accuracy and the...